Exploration of Vancouver Open Crime data

Reading in the data

TYPE                               YEAR  MONTH  DAY  HOUR  MINUTE  HUNDRED_BLOCK      NEIGHBOURHOOD              X         Y
Break and Enter Residential/Other  2003  08     09   00    40      15XX MARINER WALK  Fairview                   489839.1  5457534
Theft of Vehicle                   2003  02     05   22    00      47XX JOYCE ST      Renfrew-Collingwood        497984.1  5454417
Theft of Vehicle                   2003  12     22   08    47      47XX KILLARNEY ST  Renfrew-Collingwood        497037.3  5454333
Theft of Vehicle                   2003  03     24   22    00      47XX LANARK ST     Kensington-Cedar Cottage   494557.3  5454469
Other Theft                        2003  12     24   12    15      2X W HASTINGS ST   Central Business District  492353.4  5458773
Theft of Vehicle                   2003  11     17   22    30      47XX LITTLE ST     Kensington-Cedar Cottage   495337.7  5454436
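
As a rough sketch of this step, rows like the ones above could be loaded with base R's `read.csv`. The inline text below stands in for the real download, whose file name is not given in this write-up; with the actual file one would pass its path in place of the `text =` argument.

```r
# Inline stand-in for the real CSV export (no file path is assumed here).
csv_text <- "TYPE,YEAR,MONTH,DAY,HOUR,MINUTE,HUNDRED_BLOCK,NEIGHBOURHOOD,X,Y
Break and Enter Residential/Other,2003,08,09,00,40,15XX MARINER WALK,Fairview,489839.1,5457534
Theft of Vehicle,2003,02,05,22,00,47XX JOYCE ST,Renfrew-Collingwood,497984.1,5454417
Other Theft,2003,12,24,12,15,2X W HASTINGS ST,Central Business District,492353.4,5458773"

# stringsAsFactors = FALSE keeps the text columns as character vectors.
# Note that read.csv parses MONTH, DAY, HOUR and MINUTE as integers,
# so leading zeros such as "08" are dropped.
crime <- read.csv(text = csv_text, stringsAsFactors = FALSE)

str(crime)   # column types as parsed
head(crime)  # first rows
```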

How many of each type of crime do we have in our dataset?

TYPE                                                    counts
Theft from Vehicle                                      193009
Mischief                                                 78418
Break and Enter Residential/Other                        64213
Other Theft                                              59376
Offence Against a Person                                 58578
Theft of Vehicle                                         40236
Break and Enter Commercial                               36722
Theft of Bicycle                                         28970
Vehicle Collision or Pedestrian Struck (with Injury)     24015
Vehicle Collision or Pedestrian Struck (with Fatality)     276
Homicide                                                   240
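
A tally like the one above can be sketched in base R with `table` and `sort`. The `type` vector here is a small made-up sample, not the real column, and this is only one of several ways to get the same result.

```r
# Small made-up sample of the TYPE column, for illustration only
type <- c("Mischief", "Theft from Vehicle", "Theft from Vehicle",
          "Homicide", "Mischief", "Theft from Vehicle")

# table() tallies each distinct value; sort() puts the largest tally first
counts <- sort(table(type), decreasing = TRUE)

# Convert to a two-column data frame shaped like the table above
type_counts <- data.frame(TYPE = names(counts),
                          counts = as.vector(counts),
                          stringsAsFactors = FALSE)
print(type_counts)
```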

Counts of total crime by area

Crime counts over time by type of crime

Total crime counts over time by neighbourhood

Thefts from cars in Vancouver

Counts of theft from cars reported at each hour of the day

How many neighborhoods do we have in the dataset and how many thefts from cars happened in each?


# A tibble: 25 x 1
   NEIGHBOURHOOD            
   <chr>                    
 1 Riley Park               
 2 Grandview-Woodland       
 3 Sunset                   
 4 Mount Pleasant           
 5 Kensington-Cedar Cottage 
 6 Central Business District
 7 Hastings-Sunrise         
 8 Kitsilano                
 9 Strathcona               
10 Renfrew-Collingwood      
# … with 15 more rows

# A tibble: 24 x 2
   NEIGHBOURHOOD             count
   <chr>                     <int>
 1 Arbutus Ridge              2021
 2 Central Business District 54892
 3 Dunbar-Southlands          3182
 4 Fairview                  12884
 5 Grandview-Woodland         8300
 6 Hastings-Sunrise           6459
 7 Kensington-Cedar Cottage   8203
 8 Kerrisdale                 3044
 9 Killarney                  4343
10 Kitsilano                  9923
# … with 14 more rows
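
The first listing has 25 rows and the second has 24. One possible explanation, which is an assumption and is not confirmed by the output shown, is that some records have no neighbourhood recorded. The base R sketch below, on made-up values, shows how a missing value is counted as a distinct entry by `unique` but dropped by `table`.

```r
# Made-up sample; the NA stands in for a record with no neighbourhood
nb <- c("Fairview", "Kitsilano", NA, "Fairview", "Sunset", "Kitsilano", "Fairview")

# unique() keeps NA as its own value, so it is included in this count
n_distinct_values <- length(unique(nb))

# table() drops NA by default, so the per-neighbourhood tally is one shorter
per_nb <- table(nb)
n_tallied <- length(per_nb)

print(n_distinct_values)
print(per_nb)
```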

Days of the week

Using data from 2003 to 2017, we compare the number of thefts across the days of the week.
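
The dataset stores YEAR, MONTH and DAY separately, so a weekday has to be derived first. The sketch below is one way to do that in base R, using three dates from the sample rows at the top of this document; the `weekday` column name is an assumption.

```r
# Three dates taken from the sample rows shown earlier
rows <- data.frame(YEAR  = c(2003L, 2003L, 2003L),
                   MONTH = c(8L, 2L, 12L),
                   DAY   = c(9L, 5L, 22L))

# Build Date objects from the separate year, month and day columns
dates <- as.Date(sprintf("%04d-%02d-%02d", rows$YEAR, rows$MONTH, rows$DAY))

# wday runs from 0 (Sunday) to 6 (Saturday) and does not depend on the locale
day_index <- as.POSIXlt(dates)$wday
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")

# A factor with fixed levels keeps the days in calendar order when tallied
rows$weekday <- factor(day_names[day_index + 1], levels = day_names)
table(rows$weekday)
```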

Seasons

Looking at data between 2003 and 2017 (omitting 2018 because it is incomplete), do we see any variation between summer and winter months? Judging by the plot above, there does not appear to be any substantial difference.


# A tibble: 4 x 5
  is_summer is_winter is_fall is_spring     n
  <lgl>     <lgl>     <lgl>   <lgl>     <int>
1 FALSE     FALSE     FALSE   TRUE      43928
2 FALSE     FALSE     TRUE    FALSE     46486
3 FALSE     TRUE      FALSE   FALSE     42512
4 TRUE      FALSE     FALSE   FALSE     44943
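
The table above uses four logical flags, one per season. A single season label can be derived from MONTH in a similar way, as in the base R sketch below. The month boundaries used in the original analysis are not stated, so the meteorological seasons here (December to February as winter, and so on) are an assumption, as is the `season_of` helper name.

```r
# Assumed meteorological seasons; the real month boundaries are not stated.
season_of <- function(month) {
  month <- as.integer(month)  # accepts "08" as well as 8
  # Positions 1..12 correspond to January..December
  lookup <- c("winter", "winter", "spring", "spring", "spring", "summer",
              "summer", "summer", "fall", "fall", "fall", "winter")
  lookup[month]
}

# MONTH values from the six sample rows at the top of this document
months <- c("08", "02", "12", "03", "12", "11")
seasons <- season_of(months)
table(seasons)
```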

Incredible. The number of thefts from cars is nearly the same across all the seasons. The lowest is winter at 42,512 and the highest is fall at 46,486. Spread over the 15 years from 2003 to 2017, that difference of 3,974 (roughly 9% of the winter total) is small.
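
The seasonal comparison can be checked directly from the counts in the table. The short base R calculation below uses only those four numbers; the variable names are arbitrary.

```r
# Seasonal counts copied from the table above
n <- c(spring = 43928, fall = 46486, winter = 42512, summer = 44943)

total  <- sum(n)                     # all thefts in the four seasons
share  <- round(100 * n / total, 1)  # each season's share, in percent
spread <- max(n) - min(n)            # gap between the highest and lowest season
rel    <- 100 * spread / min(n)      # gap relative to the lowest season, in percent

print(share)
print(spread)
print(round(rel, 1))
```

If thefts were spread perfectly evenly, each season would hold 25% of the total, so the shares give a quick sense of how far each season deviates from that.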

Mapping theft from cars for 2004 in Vancouver using Leaflet


 NEIGHBOURHOOD            X                Y        
 Length:17835       Min.   :-123.2   Min.   :49.20  
 Class :character   1st Qu.:-123.1   1st Qu.:49.25  
 Mode  :character   Median :-123.1   Median :49.27  
                    Mean   :-123.1   Mean   :49.26  
                    3rd Qu.:-123.1   3rd Qu.:49.28  
                    Max.   :-123.0   Max.   :49.31  
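
The summary above shows X and Y as longitude and latitude. A minimal Leaflet sketch for plotting such points follows. The three points are made up (chosen to fall inside the ranges in the summary), the `thefts_2004` name is an assumption, and the map is only built if the `leaflet` package is installed.

```r
# Made-up points inside the coordinate ranges shown in the summary
thefts_2004 <- data.frame(
  NEIGHBOURHOOD = c("Fairview", "Kitsilano", "Central Business District"),
  X = c(-123.13, -123.16, -123.11),  # longitude
  Y = c(49.26, 49.27, 49.28),        # latitude
  stringsAsFactors = FALSE
)

if (requireNamespace("leaflet", quietly = TRUE)) {
  m <- leaflet::leaflet(thefts_2004)
  m <- leaflet::addTiles(m)  # default OpenStreetMap background tiles
  # Clustering keeps the map readable when thousands of points overlap
  m <- leaflet::addCircleMarkers(
    m, lng = ~X, lat = ~Y, radius = 3, popup = ~NEIGHBOURHOOD,
    clusterOptions = leaflet::markerClusterOptions()
  )
  # Print m in an interactive session to display the map
}
```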

Mapping theft from cars in Vancouver using ggplot2


OGR data source with driver: KML 
Source: "/Users/mohamadmakkaoui/Desktop/Code/van_car_theft_vis/cov_localareas.kml", layer: "local_areas_region"
with 22 features
It has 2 fields

       long      lat order  hole piece id group
1 -123.1641 49.25748     1 FALSE     1  0   0.1
2 -123.1639 49.25746     2 FALSE     1  0   0.1
3 -123.1636 49.25745     3 FALSE     1  0   0.1
4 -123.1626 49.25743     4 FALSE     1  0   0.1
5 -123.1603 49.25740     5 FALSE     1  0   0.1
6 -123.1579 49.25736     6 FALSE     1  0   0.1
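
A data frame with `long`, `lat`, `order` and `group` columns like the one above can be drawn as outlines with `geom_polygon`. The sketch below uses a made-up square in place of the real neighbourhood boundaries, and the plot is only built if `ggplot2` is installed.

```r
# Made-up square standing in for one neighbourhood boundary
outline <- data.frame(
  long  = c(-123.16, -123.14, -123.14, -123.16, -123.16),
  lat   = c(49.25, 49.25, 49.27, 49.27, 49.25),
  order = 1:5,
  group = "0.1",
  stringsAsFactors = FALSE
)

# Points must be drawn in boundary order within each group
outline <- outline[order(outline$group, outline$order), ]

if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(outline,
                       ggplot2::aes(x = long, y = lat, group = group)) +
    ggplot2::geom_polygon(fill = NA, colour = "black") +  # outlines only
    ggplot2::coord_quickmap()  # approximate map aspect ratio
  # Print p to draw the plot
}
```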

Making a choropleth map of Vancouver

Creating our dataset with counts per neighborhood


# A tibble: 6 x 2
  NEIGHBOURHOOD                 n
  <chr>                     <int>
1 Arbutus Ridge               207
2 Central Business District  4418
3 Dunbar-Southlands           419
4 Fairview                   1295
5 Grandview-Woodland          794
6 Hastings-Sunrise            602

It appears as though there is some discrepancy between the polygon dataset and the crime dataset when it comes to neighborhood names: the polygon file has 22 features, while the crime data lists 24 neighbourhoods. Luckily, most of the names match and can be joined directly. The ones that don't will be merged using the aggregate function.
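
One way this reconciliation could look in base R is sketched below. The polygon names, the lookup table, and the "Stanley Park" row with its count are all invented for illustration; only the other three counts come from the table above. `setdiff` finds names with no matching polygon, and `aggregate` sums the counts of rows that end up sharing a name.

```r
# Counts per neighbourhood; the last row and its count are invented
crime_counts <- data.frame(
  NEIGHBOURHOOD = c("Arbutus Ridge", "Central Business District",
                    "Fairview", "Stanley Park"),
  n = c(207, 4418, 1295, 10),
  stringsAsFactors = FALSE
)

# Hypothetical names as they might appear in the polygon file
polygon_names <- c("Arbutus-Ridge", "Downtown", "Fairview")

# Crime-data names that have no polygon with the same name
unmatched <- setdiff(crime_counts$NEIGHBOURHOOD, polygon_names)

# Hypothetical lookup from crime-data names to polygon names
lookup <- c("Arbutus Ridge" = "Arbutus-Ridge",
            "Central Business District" = "Downtown",
            "Stanley Park" = "Downtown")

# Rename only the rows that appear in the lookup
hit <- crime_counts$NEIGHBOURHOOD %in% names(lookup)
crime_counts$NEIGHBOURHOOD[hit] <- unname(lookup[crime_counts$NEIGHBOURHOOD[hit]])

# Rows that now share a name are summed into one
merged <- aggregate(n ~ NEIGHBOURHOOD, data = crime_counts, FUN = sum)
print(merged)
```

After this step every name in `merged` also appears in `polygon_names`, so the two tables can be joined on the neighbourhood name.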